Hysteresis and nonequilibrium work theorem for DNA unzipping 
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We study by using Monte Carlo simulations the hysteresis in unzipping and rezipping of a double 
stranded DNA (dsDNA) by pulling its strands in opposite directions in the fixed force ensemble. 
The force is increased, at a constant rate from an initial value go to some maximum value g m that 
lies above the phase boundary and then decreased back again to 30- We observed hysteresis during a 
complete cycle of unzipping and rezipping. We obtained probability distributions of work performed 
over a cycle of unzipping and rezipping for various pulling rates. The mean of the distribution is 
found to be close (the difference being within 10%, except for very fast pulling) to the area of the 
hysteresis loop. We extract the equilibrium force versus separation isotherm by using the work 
theorem on repeated non-equilibrium force measurements. Our method is capable of reproducing 
the equilibrium and the non-equilibrium force-separation isotherms for the spontaneous rezipping 
of dsDNA. 

PACS numbers: 87.14gk, 87.15. Zg, 36.20.Ey 
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I. INTRODUCTION 

Unzipping of a double stranded DNA (dsDNA), an 
essential step in biological processes like DNA repli- 
cation and RNA transcription, is carried out by en- 
zymes that exert an external force on the strands of the 
DNA The phenomenon has been studied both the- 
oretically @, Q and experimentally [3, Q by applying a 
pulling force on the strands of the DNA. In theoretical 
models, the strands of the DNA arc modelled either on 
the lattice, by random or self-avoiding walks, or in the 
continuum by worm like chains. It was found that when 
a pulling force is applied on a dsDNA, the two strands 
unzip if the force exceeds a critical value. Below the crit- 
ical force the DNA is in the zipped phase while above 
it the DNA is in the unzipped phase. The force can be 
applied on the DNA by cither keeping the separation be- 
tween the strands fixed (fixed distance ensemble) or by 
applying a fixed pulling force (fixed force ensemble) on 
the strands. For the later case, the separation between 
the strands fluctuates while it is the force needed to keep 
the separation fluctuates in the fixed distance ensemble. 
The unzipping of dsDNA by a pulling force is a first-order 
phase transition 0, Q . 

In a continuous phase transition large fluctuations in 
the order parameter are present near the transition re- 
gion that act as a precursor that something unusual is 
about to occur. In the case of DNA, if the melting is con- 
tinuous, there will be large fluctuations in the size and 
shape of the denatured bubbles along the chain. These 
fluctuations are absent in a first-order phase transition 
and the order parameter changes abruptly as the phase 
boundary is crossed. However, there is usually a hystere- 
sis associated with the first order transition which causes 
the change to occur at a point that is slightly displaced 
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from the phase boundary. This is because at a first order 
phase boundary the two phases can coexist and be sep- 
arated by interfaces. The energy of the interface acts as 
a barrier between two phases. Hysteresis is often linked 
to the dynamics of interfaces. Some aspects of interfaces 
in DNA have been discussed in Ref. [jjj. Near the phase 
boundary, there is a region of mctastability where the 
system can stay in its previous phase even after cross- 
ing the phase boundary. From the dynamics point of 
view, the relaxation time or the time scale to cross the 
barrier becomes large near the transition and therefore, 
there is a conflict between relaxation and the time scale 
of change of parameters. This produces hysteresis. A 
classic example of first-order transition in which hystere- 
sis has been studied in detail is the Ising model below 
the critical temperature in an external magnetic field Q ■ 
In recent years, hysteresis has been studied in unbind- 
ing and rebinding of biomolecules under a pulling force 
by using single molecule manipulation techniques [8l-fl0| 
because it can provide useful information on kinetics of 
conformational transformations, potential energy land- 
scape, and controlling the folding pathway of a single 
molecule (Tl| . 

The equilibrium statistical mechanics is a celebrated 
framework that gives the microscopic description of the 
thermodynamics of the system. However, one ma- 
jor challenge, often faced in designing experiments, is 
the requirement of thermodynamic equilibrium; the sys- 
tem should remain in equilibrium, or at least at quasi- 
cquilibrium, throughout the course of the experiment, 
and needs to be equilibrated whenever system param- 
eters are changed. However, in the last decade many 
remarkable identities, known as nonequilibrium work or 
fluctuation theorems (see Ref. [12| for a review), are de- 
veloped that bridges the gap between the nonequilibrium 
and equilibrium statistical mechanics. One of them is the 
Jarzynski equality ( JE) [l3| , which connects the thermo- 
dynamic free-energy differences between the two equi- 
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librium states (say A and B), AF = Fb — Fa, and the 
irreversible work done, W, in taking the system from one 
equilibrium state A to a non-equilibrium state having the 
same external conditions as that of the other equilibrium 
state B. The relation between AF and W is 



-AF/fc B T 



-W/k B T\ 



(1) 



where fee is the Boltzmann constant and T is the ab- 
solute temperature. The bracket (■ • • ) denotes average 
over all possible paths between A and B. Recently, Sad- 
hukhan and Bhattacharjee [l4[ have given a dynamics- 
independent nonequilibrium path integral formulation of 
the JE that can easily be generalized to cases involv- 
ing parameters like temperature and interactions. The 
applicability of JE has been questioned for boundary- 
switching processes. It was shown that for spontaneous 
irreversible processes (e.g., the adiabatic expansion of the 
gas into vacuum) the JE is not satisfied [15(. However, 
our simulations show that one can get the equilibrium in- 
formation by using JE and the multiple histogram tech- 
nique fl(i| even for the spontaneous rezipping of dsDNA. 

In this paper we study the hysteresis in unzipping and 
rezipping of a homopolymcr dsDNA when its strands are 
pulled in opposite directions by a force. The force g is 
increased, at a constant rate g = Ag/ At, from some ini- 
tial value go to some maximum value g m that lies above 
the phase boundary. The force g is then decreased back 
to go at the same rate. We observed hysteresis during 
a complete cycle of unzipping and rezipping. By using 
the work theorem on repeated non-equilibrium force mea- 
surements on the forward and the backward paths, we 
extract the equilibrium force-distance isotherm. We also 
show that our procedure can be used to obtain the com- 
plete equilibrium and nonequilibrium force-separation 
isotherms for the spontaneous rezipping of dsDNA. In 
this process, the dsDNA is initially at equilibrium with a 
maximum pulling force g = g m and the force is suddenly 
decreased to a minimum force g = go. 

The paper is organized as follows: In Sec. HH we de- 
fine our model and the details of the Monte Carlo sim- 
ulations. We compare the equilibrium results obtained 
by the simulation with the exact results known for the 
model. The results are discussed in Sec. IHII and summa- 
rized in Sec. IIVI 



II. MODEL 

The two strands of a homo-polymer DNA are repre- 
sented by two directed self-avoiding walks on a d = 1 + 1 
dimensional square lattice. The walks starting from the 
origin are restricted to go towards the positive direction 
of the diagonal axis (^-direction) without crossing each 
other. The directional nature of the walks takes care of 
self- avoidance and the correct base pairing of DNA, that 
is, the monomers which arc complementary to each other 
are allowed to occupy the same lattice site. For each such 
overlap there is a gain of energy — e (e > 0). One end of 




FIG. 1. (Color online) Schematic diagram of DNA unzipping. 
The strands of the DNA are shown by thick solid lines with 
a pulling force g act along x direction at end monomers. 



the DNA is anchored at the origin and a force g acts 
along the transverse direction (x-dircction) at the free 
end. The schematic diagram of our model is shown in 
Fig. HI 

Let D n (x) be the partition function in the fixed dis- 
tance ensemble of a dsDNA of length n with separation 
x between the nth monomers of two strands. D n (x) sat- 
isfies the following recursion relation: 



D n+1 (x) = [D n (x + 1) + 2D n (x) +D n (x - 1) 



[1 



-1)4 



,r.O 



, (2) 



with an initial condition Do(x) = e^ e 5 x fi- 

The above recursion relation can be solved exactly via 
the generating function technique fl7l4l9| . The aim is 
to calculate the singularities of the generating function. 
The singularity closest to the origin gives the phase of 
the DNA and the phase transition takes place when two 
singularities cross each other. The generating function 
for D n (x) can be taken of the form (ansatz) 



D(z,x) =J2z nD n(x) = \ x (z)A{z). 



(3) 



When used in the above recursion relation, we ob- 
tain X(z) = (1 - 2z - VI - 4z)/(2z) and A(z) = 
1/ [l - z (2 + X(z)) e^] . The singularities of X(z) and 
A(z) arc 1/2 and z% 



e /3e respec- 



v/l - e-^ - 1 
tively. The zero force melting, which comes from z\ = 
z 2 , takes place at a temperature T m = e/ln(4/3). In 
the large length limit, D n (x) can be approximated as 
-Djv(x) ps X x (z 2 )/z2 +1 , with the free energy @F = 
NhiZ2 — xlnA(z2)- The force required to maintain the 
separation x is then given by 



/ \ OF 
°W = Wx 



lnA(z 2 ), 



(4) 



where 2 in the denominator is the unit length of the di- 
agonal of the square lattice. In the fixed force ensemble, 
the generating function can be written as 

G(z, p, g) = e2p9X J2 znL> n(x) = e 2 ^ x X x {z)A{z) 

x n x 

A(z) 



l-X(z)e 2 P9' 



(5) 
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which has an additional force dependent singularity Z3 = 
l/[2 + 2 cosh (2/3g)]. The phase boundary comes from 
Z2 = Z3, and is given by 



sCO 



— cosh 1 



1 



1 



1 



(6) 

which is same as the phase boundary obtained in the 
fixed distance ensemble. The phase diagram for kg — 
1 is shown in Fig. H[a). The phase diagrams of DNA 
unzipping has been obtained previously in Refs. p7l - [l9| . 

For many other equilibrium properties, based on ther- 
mal averaging, the exact transfer matrix technique can be 
used to obtain numerically the partition function Dn(x) 
for the DNA of length N. This technique gives exact 
numerical results for finite N, which is rather difficult 
in the generating function method. To obtain Dn(x) at 
any given temperature we start with an initial condition 
Dq(0) = 1 and iterate the above recursion relation [i.e. 
Eq.@] N times. The equilibrium average separation be- 



tween the end monomers, 
by 



/cqj 



can then be obtained 



<•-<•> 



eq 



J2x x D N(x)e 



(7) 



which is shown in Fig. J^b) for a DNA of length N = 128 
at T = 1.0. 

In this paper, to go beyond equilibrium, we perform 
Monte Carlo simulations of the model by using the 
Metropolis algorithm, which allows us to study both equi- 
librium and off-equilibrium behaviors. The strands of 
the DNA undergo Rouse dynamics that consists of lo- 
cal corner-flip or end-flip moves [2(| that do not vio- 
late mutual avoidance (the self-avoidance is taken care 
by the directional nature of the walks). The elemen- 
tary move consists of selecting a random monomer from 
a strand, which itself is chosen at random, and flipping 
it . If the move results in overlapping of two complemen- 
tary monomers, thus forming a base-pair between the 
strands, it is always accepted as a move. The opposite 
move, that is, the unbinding of monomers, is chosen with 
the Boltzmann probability 7/ = cxp(— e/fcgT). If the cho- 
sen monomer is unbind, which remains unbind after the 
move is performed is always accepted. The time is mea- 
sured in units of Monte Carlo steps (MCSs). One MCS 
consists of 2N flip attempts, that is, on an average, every 
monomer is given a chance to flip. Throughout the simu- 
lation, the detailed balance is always satisfied. From any 
starting configuration, it is possible to reach any other 
configuration by using the above moves. Throughout the 
paper, without loss of generality, we have chosen e = 1 
and fcs = 1. 

To check if the results obtained by using the above 
mentioned moves are consistent with the analytical re- 
sults obtained previously, we calculate the force g vs 
equilibrium average separation (x) eq between the end 
monomers of the DNA of length N = 128 at T = 1. 
This is shown in Fig. HJb) by filled circles. Every data 



point is obtained by first equilibrating the system for 
2 x 10 5 MCSs and then averaged over 10 4 different real- 
izations. In the same plot we have also shown, by solid 
line, the force-distance isotherm obtained by using the 
exact transfer matrix calculations for the model. The re- 
sults of Monte Carlo simulations match excellently with 
the exact result. The equilibrium configurations of the 
dsDNA of length N = 128 at temperature T = 1 for two 
different forces g = 0.65, which lies just below the phase 
boundary, and g = 0.9, which is far away from the phase 
boundary (and also the maximum force used in this paper 
at T = 1) are also shown in Fig. (5Jc). These configura- 
tions show that the DNA is in the zipped phase (with a 
small Y-fork at the end) for the force below the critical 
force and in the unzipped phase for the force above the 
critical force. 

To study the hysteresis in DNA, we start the simula- 
tion with a valid configuration of a dsDNA of length N = 
128 at T = 1 and N = 256 at T = 3.6. The later tem- 
perature is above the melting temperature T rn ps 3.476 of 
the dsDNA for the model used in this paper. The system 
is first equilibrated with zero pulling force go = 0. The 
force g is incrementally increased from go to g m = 0.9 
at T = 1 (g m = 1.0 is used at T = 3.6) at a step of 
Ag F = 0.01 by using the following protocol 



9i = 9o + i&g F , 



(8) 



where i = 0, 1, 2, . . . , n with n = (g m — go)/Ag F is the 
number of steps between the initial and the final force 
values. The superscript F denotes the forward path. For 
the backward path (denoted by superscript B) the force is 
incrementally decreased from g m to go by Ag B = —Ag F . 
The number of steps n and the time interval At are kept 
same as that of the forward path. 

Each step of the process can be thought of two sub- 
steps. In the first substep, the force is increased by Ag F . 
Therefore, an amount of work AW F = —Ag F x F has to 
be performed on the system, where xf is the separation 
between the end monomers of the DNA at the beginning 
of the ith step. In the second step, the system is relaxed 
for the time interval At in the presence of the pulling 
force gi+i- The total work performed on the system dur- 
ing the complete forward process is 



W F = -Ag F J24- 



(9) 



Similarly, for the backward path the work performed by 
the system is 



W 1 



(10) 



i=0 



The above procedure is repeated many times to obtain 
various trajectories. For each realization, the system is 
initially equilibrated at go but no attempt has been made 
to equilibrate the system at the maximum force g m used 
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FIG. 2. (Color online) (a) Critical unzipping force as a function of temperature [Eq. (j4j]. (b) Force g vs equilibrium average 
separation {x) eq between the end monomers of the dsDNA at T — 1.0. The line is from the exact transfer matrix approach 
and points are from the Monte Carlo simulations, (c) Typical equilibrium configurations of the dsDNA of length N = 128 for 
force values g = 0.65 (lies just below the phase boundary) and g — 0.9 (far above the phase boundary) at T = 1.0 obtained by 
using Monte Carlo simulations. The lines are the bonds between monomers of the strands. 



in this paper. The total work done over a complete unzip- 
ping and rezipping cycle is given by the sum of the work 
performed along the forward and the backward paths 

W = W F + W B . (11) 

The work performed W is different for different realiza- 
tions. The following sign convention is adopted in this 
paper: The positive (negative) sign of work denotes the 
work done on the system (by the system). 



III. RESULTS AND DISCUSSIONS 

Before discussing our results let us fix some notations 
to avoid confusion. We are working in the fixed force 
ensemble and define the pulling rate by g = Ag/At. We 
fix the force interval to Ag = 0.01 (in magnitude) for 
both the forward and the backward paths and change the 
time interval At to change the pulling rate. Therefore, 
instead of giving the actual numerical value we just give 
the time interval At for the pulling rate. For example, the 
pulling rate for At = 100 means g = 0.01/100 = 10~ 4 . 



A. Hysteresis curves 

In Fig. OHa), we have shown, for two different realiza- 
tions, the force g versus separation x between the end 
monomers of dsDNA for the pulling rate for At = 1000. 
The forward and backward paths are shown respectively 
by open and filled symbols. These realizations reveal that 
the system does not get enough time to relax to the equi- 
librium and shows hysteresis. The average separation (x) 



at force g is obtained by 

1 M 

i=l 

for both the forward and backward paths. The result- 
ing hysteresis for various pulling rates averaged over 
M = 10 5 realizations are shown in Fig. Elb). When At 
is smaller (i.e., the pulling rate is higher), the system 
does not get enough time to respond to the change in 
the pulling force and the average separation between the 
strands is small even if the pulling force is greater than 
the critical force needed to unzip the DNA. On decreas- 
ing the force from the maximum, the separation between 
the strands initially increases because the force is still 
greater than the critical unzipping force and the system 
gets some more time to relax. Therefore, on the backward 
path there exist a force at which the average separation 
is equal to the equilibrium separation. This is the point 
at which the equilibrium curve cuts the backward path of 
the hysteresis loop in Fig. IHb). On decreasing the force 
further, the average separation decreases slowly and the 
system is again driven away from the equilibrium. If we 
join the forward and the backward paths we get a small 
hysteresis loop. The area of the loop gives the amount of 
heat that is deposited to the system. Since for the fast 
pulling rate (i.e., small At) only a small segment of the 
dsDNA can be opened by the force, the area of the loop 
is small. As At is increased, the system gets more and 
more time to respond to the change in the pulling force 
and the average separation between the strands at g m in- 
creases. This increase in separation continues initially on 
the backward path also and therefore the area of the hys- 
teresis loop increases as is visually seen from Fig. [3jb). 
But the area of the loop cannot increase forever with the 
increase of At. For sufficient large At, the average sepa- 
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ration between the strands at g m becomes closer to the 
equilibrium separation as can be seen for At = 10000 
curve. We will see later that for such cases, the noncqui- 
librium measurements along the backward paths can also 
be used to calculate the equilibrium curve. If At is very 
large, that is, the pulling is very slow, the system gets 
sufficient time to get equilibrated before the force is in- 
creased to a new value. Therefore, the system remains 
in equilibrium for all force values and does not show any 
hysteresis (zero loop area). This situation is shown in 
Fig. G3T) by filled circles. The area of the hysteresis loop 
calculated by integrating numerically using trapezoidal 
rule for various At values are tabulated in Table I, which 
confirms the above statement. 



At 


(W) 


a 


Area of 








hysteresis loop 


100 


10.3 ±0.3 


8.0 ±0.3 


12.636 


500 


25.6 ±0.2 


12.8 ±0.2 


28.057 


1000 


37.5 ±0.2 


14.3 ±0.2 


38.066 


5000 


26.17 ±0.03 


6.75 ± 0.03 


27.687 


10000 


16.33 ±0.02 


5.47 ±0.02 


18.029 


12000 


14.42 ± 0.05 


5.12 ±0.05 


16.138 



TABLE I. The average (W) and the standard deviation a of 
the probability distribution of work performed over an unzip- 
ping and rezipping cycle, and the area of the hysteresis loop 
for various At values. 
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FIG. 3. (Color online) (a) The separation x between the end 
monomers of the DNA as a function of force g for the forward 
(unzipping) and the backward (rezipping) paths for two dif- 
ferent realizations of a dsDNA of length N = 128 at T = 1 for 
the pulling rate At = 1000. The filled and the open symbols 
represent the forward and the backward paths, respectively. 
The work done over a complete cycle is negative (positive) 
for the realization shown by circles (diamond), (b) Hystere- 
sis curves for different pulling rates. The equilibrium curve, 
shown by the filled circles, does not show any hysteresis. The 
paths are averaged over 10 different realizations. 



B. Probability distribution of work performed over 
a cycle 

As stated in Sec. UH the work performed during a com- 
plete unzipping and rezipping cycle are different for dif- 
ferent realizations. There are a few realizations for which 
the work obtained during the rezipping process is more 
than the work performed during the unzipping process. 
One such realization is shown in Fig.^a) by circles. The 
second law of thermodynamics cannot be violated. But 
if one looks at an individual or a group of such trajec- 
tories, there is an illusion of a violation. However, for 
the majority of trajectories, which look more or less like 
the realization shown in Fig. [31(b) by diamonds, the work 
performed during the unzipping process is more than the 
work obtained during the rezipping process. The prob- 
ability distribution of work guarantees that the second 
law of thermodynamics is satisfied on an average. 

In Fig. |H we have plotted the probability distribution 
of work, P(W), performed over a complete unzipping and 
rezipping cycle for various At values. The solid lines are 




-10 10 20 30 40 50 60 70 80 90 

w 

FIG. 4. (Color online) Probability distribution P(W), of the 
work performed during a complete unzipping and rezipping 
cycle for various At values. The solid lines are the Gaussian 
fit to the distributions while the dashed line for At = 100 is 
guide to eyes. 
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the Gaussian fit to the data 



P(W) = Aexp 



(W-(W)Y 



2a 2 



(13) 



where, (W) and a, respectively, represent the average 
work performed during a cycle and the standard devia- 
tion of the distribution and A is the normalization con- 
stant. The values obtained for various At are tabulated 
in Table I. 

One can observe from the figure that for At = 100 (i.e., 
for faster pulling rates), the probability distribution de- 
viates from the Gaussian distribution. The asymmetry 
of the distribution is quite visible, so we have not shown 
the Gaussian fit for this data but joined data points by a 
dashed line to guide the eyes. The distribution is peaked 
towards lower W values and therefore the probability of 
obtaining negative work over a cycle is also higher. The 
average work is however positive. As the pulling rate 
decreases, the distribution becomes more and more sym- 
metric and tend towards the Gaussian distribution. The 
peak of the distribution first shifts towards higher W val- 
ues and it becomes broader as can be seen in the figure for 
At = 1000. On decreasing the pulling rate further, the 
mean and the width of the distribution again starts de- 
creasing. If the pulling rate is extremely slow, the system 
remains in equilibrium at all times during the forward 
and the backward path, and therefore the work done on 
the system during the forward path is exactly equal to 
the work done by the system during the backward path. 
The total work performed over a cycle is zero and the dis- 
tribution P(W) is sharply peaked at W = 0. Therefore, 
at equilibrium, the average work done on the system is 
exactly equal to the area of the hysteresis loop. In the 
noncquilibrium regime, except for very fast pulling rate 
(i.e., At = 100), the difference of the average work done 
on the system and the area of the hysteresis loop is within 
10% (see Table I) of the area of the loop. The differ- 
ence comes because only 10 5 trajectories are simulated. 
This many trajectories are good enough to calculate the 
average extensions and hence the area of the hysteresis 
curves. However, these are not good enough to calculate 
the probability distributions of the work, which are ob- 
tained by binning the data. If more realizations are used 
in obtaining the probability distributions the difference 
between the two quantities will become smaller. 

For a cyclic process, the initial and the final equilib- 
rium states are the same. Substituting AF = in Eq. (JTJ 
one gets 



Z 



-pw 



1 M 

— Y 



-/3Wi 



1. 



(14) 



i=l 



This equality strictly holds for infinite trajectories M —> 
oo. The convergence is often slow because it picks 
up exponential weights and the trajectories that con- 
tribute to the sum (i.e. having extreme weights) are 
rare. We calculate Z for At = 10000 case. On averag- 
ing over M = 10 2 , 10 3 , 10 4 , and 10 5 trajectories, we ob- 
tain Z = 0.0027, 0.0102, 0.0827, and 0.2784, respectively. 



This shows that Z increases with number of trajectories 
used in averaging. This direct sum over paths may not 
be a useful numerical way to estimate the free-energy 
or establish Eq. (fT4")) . However, a weighted average as 
discussed below is more efficient for averages. The prob- 
ability distributions P(W) obtained above can be used 
to calculate Z: 



Z 



dWe 



-pw 



P{W) = cxp -P{W) + 



(15) 

Substituting the value of (W) and a for At = 10000 from 
Table I, we get Z = 0.2542, which is somewhat smaller 
than the value obtained from Eq. (fT4"]) for 10 5 trajecto- 
ries. Equation (fl"5|) is not applicable for fast pulling (i.e., 
smaller At values) because of deviations from Gaussian. 
For those cases, higher cumulants are important. In the 
limit of infinite trajectories, P(W) will become the true 
probability distribution of work and the value of Z ob- 
tained by either methods will approach 1. 



C. Equilibrium curves from non-equilibrium 
measurements 

In this section, we discuss the procedure that can be 
used to obtain the equilibrium force-distance isotherms 
by using non-equilibrium measurements on the forward 
and the backward paths. Our technique is similar to 
that of Hummer and Szabo [2l|, but in the conjugate 
ensemble). The above method (Ref. [2l|) has been used 
successfully to obtain the zero force free energy on single 
molecule pulling experiments in constant velocity ensem- 
ble 

In Figs. [5{a), [5jb) and[5jc), we have plotted the hys- 
teresis loops for At = 5000, 10000, and 12000, and the 
equilibrium curve obtained by using the exact transfer 
matrix for N = 128 and T = 1, which is below the 
melting temperature T m = 3.476. In the same plot we 
have also shown the equilibrium data calculated from the 
noncquilibrium force measurements by using the proce- 
dure described below. 

We divide the forward path into intervals of sizes Ag. 
Let i and k represent respectively the indices for the sam- 
ple and the force. The irreversible work done over the ith 
non-equilibrium path, taking = at go, is given by 



Wu 



(16) 



By using exp(— j3Wik) as the weight for path i, the equi- 
librium separation between the end monomers of the ds- 



DNA, x ^ 1 , at force gp. can be obtained by 



Y,i=i x ik exp(-^Wife) 



E£iexp(-j8W«) 



(17) 



The above procedure has been used by Sadhukhan and 
Bhattacharjee 14| to obtain the equilibrium curve. 
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FIG. 5. (Color online) Hysteresis obtained in unzipping and rezipping of a dsDNA for various pulling rates at two different 
temperatures, (a) to (c) at T = 1.0 < T m and (d) to (f) at T = 3.6 > T m . The forward (shown by dashed line) and the 
backward (shown by dotted line) paths are averaged over 10 J realizations for (a) and (b) while for cases (c) to (f) paths are 
averaged over 10 4 realizations. The unfilled and filled circles represent the equilibrium curves obtained by using work theorem 
on nonequilibrium forward and backward paths, respectively. The solid lines represent the equilibrium curves obtained by using 
the exact transfer matrix technique. 



In our simulation, we have 10 5 Xk values, at each force 
gk- These values can be used in Eq. (fTT| to obtain x c ^ 
at gk- However, one can do better than this. For a given 
temperature T and force g, the system at equilibrium 
samples a narrow phase space given by the Boltzmann 
distribution. The distributions at nearby force values 
overlap with each other. The overlapping distributions 
can be properly weighted to obtain the approximate den- 
sity of states (DOS) of the system. This goes by the 
name of multiple histogram technique fl6| and has been 
exploited in simulations. Once the DOS is known, the ob- 
servables can be calculated at any other force. To achieve 
this we build up the histogram Hk(x) at force gk- For 
the ith realization, if the separation is x, we increment 
the corresponding histogram value by exp(— /3Wik), 

M 

H k (x)=J2 e ~ mik ( 18 ) 

i=l 



where, S XfXi = 1 if x = Xi, and zero otherwise. The 
partition function Zk at force gk , obtained by using the 
multiple histogram technique [16| , reads as 

z k = Y2 P( x ) exp(Pg k x), (19) 

X 

where 

= ^ LpSU ■ ( 2 °) 

is the DOS. Equation (fl"9| needs to be evaluated self- 
consistently. We take initial ZkS as 

1 M 

Z t = -^exp(-«, (21) 

»=i 

and iterate Eq. (jT9"]) till it converges to a true DOS. The 
initial values are motivated by the JE [see Eq. (JTJ)]. By 
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FIG. 6. (Color online) The equilibrium curves, shown by unfilled (filled) circles for the forward (backward) paths, obtained by 
using work theorem on nonequilibrium paths when the backward paths are also initially equilibrated at g m . The dashed and 
dotted lines are, respectively, the forward and backward nonequilibrium paths averaged over 10 4 realizations at T = 3.6 for 
various pulling rates. The solid line represent the equilibrium curve obtained by using the exact transfer matrix technique. 



using p(x), we can then evaluate the equilibrium separa- 
tion x c ^ at force g^ by 

x< k = ~^ exPiPakx)- (22) 

The equilibrium force-separation curve obtained by us- 
ing the data of 10 5 nonequilibrium forward paths for 
At = 5000 and 10000, and 10 4 paths for At = 12000, in 
above procedure is shown by open symbols in Figs. EJa)- 
GUc). The data for At = 10000 and 12000 match rea- 
sonably well with the equilibrium curve obtained by the 
exact transfer matrix method. The same procedure can 
also be adopted for the backward path. The equilibrium 
force-extension curve obtained by using the nonequilib- 
rium data for the backward path is shown by filled cir- 
cles which match excellently with the exact curve for 
At = 10000 and 12000. For the fast pulling case, that is, 
At = 5000, the results from the forward paths match all 
other points except at the transition region due to poor 
statistics in that region. The multihistogram approach, 
by analyzing the errors in the estimates of the partition 
function, tries to correct for the unexplored part of the 
phase space. Even for this to work, one needs good sam- 
pling or better statistics. The other approach is to gener- 
ate rare conformations that have dominant contributions 
in the weighted sum by using special algorithms f23T - |25j ] . 
It would be interesting to do a comparative study of 
the two approaches. Special algorithms arc important 
in generating trajectories for very fast pulling rates (e.g., 



At = 100, 500, etc.), where not enough rare configura- 
tions can be generated by simply including more trajec- 
tories due to low probabilities of such configurations. We 
have also shown, in Figs. [SJd)- EJf), the hysteresis and 
equilibrium curves obtained by using the above procedure 
for N = 256 at temperature T = 3.6 which is above the 
melting temperature of the DNA. These are obtained for 
the pulling rates with At = 1000, 2000 and 3000 respec- 
tively by averaging over 10 4 nonequilibrium paths. The 
equilibrium curve obtained by using the data of the for- 
ward paths match excellently with the exact curve while 
the curve that uses the data of the backward paths de- 
viates from the exact curve at higher forces. As stated 
previously, we have not made any attempt to equilibrate 
the system at the maximum force g m . Still, we could get 
an excellent match with the equilibrium curve by using 
the data for the backward paths for T = 1 but not with 
the data for T = 3.6. This can be understood by ob- 
serving the hysteresis curves near g m for both cases. For 
T = 1, with At = 10000 and 12000, the average separa- 
tion between the end monomers of the DNA at g m is quite 
closer to the equilibrium curve. Therefore, the system is 
practically in equilibrium at the beginning of the back- 
ward paths and one can use the work theorem [Eq. (JTJ] to 
obtain equilibrium properties. However, for T = 3.6, the 
system has not reached the equilibrium at g m and so the 
requirement of work theorem, that is, the system should 
initially be in equilibrium, is not satisfied for the back- 
ward paths and it cannot be applied for this case. If we 
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FIG. 7. (Color online) Equilibrium g versus (x) isotherms ob- 
tained by using work theorem on 10 3 , 10 4 , and 10 5 nonequi- 
librium trajactories for the pulling rate At = 100 at T = 3.6. 
The unfilled (filled) circles represent the forward (backward) 
path. The solid lines represent the exact equilibrium curve. 



allow initial equilibration at the maximum force g m then 
the work theorem can be applied for the backward paths 
also and the weighted data on the backward paths should 
give us the equilibrium curve. This we have explored at 
T = 3.6 for various At values in Fig. [5] One can see from 
the figure that for At = 500, 1000, and 2000, the curves 
obtained by using our procedure on 10 4 noncquilibrium 
trajectories match reasonably well with the exact curve. 
For At = 100, which is the fastest pulling rate reported 
in this paper, the curves deviate from the exact curve at 
other ends but this is due to the poor statistics in that 
region. In Fig. we have shown how these curves behave 
when the weighted average is taken over a different num- 
ber (10 3 , 10 4 and 10 5 ) of trajectories. On including more 
trajectories, we can see that the curves move towards the 
exact equilibrium curve. To get better statistics in the 
fast pulling regime, one can use algorithms [23Tj25| men- 
tioned previously to generate rare trajectories. These 
can be included in the above procedure to get a better 
match but we have not attempted it in this paper. It 
is clear from the above discussion that the above proce- 
dure can be used successfully to obtained the equilibrium 
force-separation isotherms from the noncquilibrium mea- 
surements when a pulling force is applied on the strands 
of the dsDNA that varies with a constant rate. 

To check the robustness of the above method, we study 
the spontaneous rezipping of dsDNA. The DNA is ini- 
tially equilibrated with a pulling force g m = 0.725 at a 
temperature T = 1. The force g m lies above the phase 
boundary (but close to it) and the dsDNA is in the un- 
zipping phase. The force is suddenly decreased to a lower 
value go that lies below the phase boundary in which the 
DNA is in the zipped phase (i.e., a different boundary 
condition than g m )- From the initial to the final force, the 
system is equilibrated for 1 MCS (i.e., At = 1). It is inter- 
esting to know if the method discussed above is capable of 
predicting the equilibrium and the noncquilibrium force- 



FIG. 8. (Color online) Equilibrium and nonequilibrium g 
versus (a;) isotherms obtained by using a work theorem on 
10 4 trajactories when the pulling force on the DNA decreases 
spontaneously from g m — 0.725 to various go values as indi- 
cated. The unfilled (filled) symbols represent the equilibrium 
(nonequilibrium) curves. The solid line represent the exact 
equilibrium curve and dotted lines are a guide to eyes. 



separation curves at various other force values. In Fig. [5] 
we have shown the curves obtained by using the above 
method with 10 4 trajectories when the force is sponta- 
neously decreased from g m = 0.725 to go = 0.6, 0.4, 
0.2, and 0.0. The equilibrium (noncquilibrium) curves 
are shown by the unfilled (filled) symbols. An excellent 
match with the exact force-separation isotherm shows 
that the above method can be used to predict the equi- 
librium force-separation isotherms from the spontaneous 
rezipping of dsDNA. The application range of the above 
method is limited by how good statistics of the end sep- 
aration one can get. The maximum force g m = 0.725 
is chosen near the phase boundary because, even though 
the DNA is in the unzipped phase, many configurations 
exist, which is needed to sample the whole phase space. 
If we take g m far away from the phase boundary, the 
unzipped phase will only have fully stretched configura- 
tions and isotherms extracted by the above method will 
deviate from the exact result at smaller force values due 
to poor statistics in that region. The fact that the spon- 
taneous unzipping is capable of reproducing the exact 
equilibrium force-separation isotherms makes the above 
method an excellent candidate for analyzing the pulling 
experiments. It, however, needs to be explored further. 
The detailed analysis will be published elsewhere. 



IV. CONCLUDING REMARKS 

To summarize, we have studied the hysteresis in un- 
zipping and rezipping of a dsDNA in the fixed force en- 
semble. We found that the area of the hysteresis loop 
depends on the pulling rate. For fast pulling the area 
of the loop is smaller. On decreasing the pulling rate, 
the area of the loop first increases, and then starts de- 



10 



creasing due to the system's proximity to the equilibrium 
for the sufficiently slow pulling rates. On decreasing the 
pulling rate further, the system remains in equilibrium 
at all intermediate force values and the area of the loop 
becomes zero. We obtained the probability distributions 
of work performed over a complete unzipping and rezip- 
ping cycle for various pulling rate. The average of this 
distribution is found to be very close (the difference being 
within 10%, except for very fast pulling i.e. At = 100) to 
the area of the hysteresis loop. We also discussed a pro- 
cedure to obtain equilibrium force-distance isotherms by 
using repeated non-equilibrium measurements on the for- 
ward paths. We found that if the pulling rate is such that 
the average separation between the end monomers at the 
maximum force used is close to the equilibrium curve, 
the backward path gives better results than the forward 



paths. Furthermore, using our technique, one can obtain 
the complete equilibrium and the non-equilibrium force- 
separation isotherms for the spontaneous rezipping of ds- 
DNA. We believe that our multiple histogram based algo- 
rithm using work theorem can be implemented in molec- 
ular manipulation machines to provide both the equi- 
librium and the nonequilibrium information for pulling 
experiments. 
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